Skip to content

fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923

Open
davison wants to merge 1 commit intopaperclipai:masterfrom
radiusred:fix/docker-volumes
Open

fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors#1923
davison wants to merge 1 commit intopaperclipai:masterfrom
radiusred:fix/docker-volumes

Conversation

@davison
Copy link
Copy Markdown

@davison davison commented Mar 27, 2026

Thinking Path

Paperclip orchestrates AI agents for zero-human companies — it runs persistent processes that read and write agent state to a data directory.

To run Paperclip in a hosted or cloud environment, operators use Docker — mounting a host directory as /paperclip so data persists across container restarts.

The container runs as the node user (UID 1000). When the host directory is owned by a different UID (any non-root user on many Linux distros), writes to the mounted volume fail with permission errors. This is a very common problem, raised repeatedly in the community — as seen again recently in #dev.

The standard Docker solution is to remap the container user's UID/GID at startup to match the host directory owner — well-established practice, typically implemented with gosu and a minimal shell entrypoint.

The existing Dockerfile has no mechanism for this: it uses a static USER node directive with no entrypoint script, so there is no opportunity to perform the remap.

This PR adds USER_UID/USER_GID build args for build-time pre-baking, installs gosu, and adds a minimal docker-entrypoint.sh that re-checks and remaps the UID/GID at container startup before dropping privileges and exec-ing the server.

What Changed

  • Dockerfile
    • Added ARG USER_UID=1000 and ARG USER_GID=1000 build args (defaults to 1000 — no behaviour change for existing deployments)
    • Added gosu to the base apt install
    • Added usermod/groupmod at build time to pre-bake the UID/GID into the image layer
    • Added USER_UID and USER_GID to the ENV block so they are available at runtime
    • Replaced static USER node directive with ENTRYPOINT ["docker-entrypoint.sh"]
    • Copies docker-entrypoint.sh to /usr/local/bin/ and makes it executable
  • docker-entrypoint.sh (new file)
    • Reads USER_UID/USER_GID env vars at startup
    • Calls usermod/groupmod if the running UID/GID differ from the node user (handles runtime override without rebuild)
    • Calls chown -R node:node /paperclip to correct volume ownership
    • Drops to node user via exec gosu node "$@" — no persistent root process

Note on customisation: A question that keeps coming up is how to add tools, MCP servers, or extra packages to the image (e.g. for specific agent adapters). The cleanest pattern is to build a downstream image using Paperclip as a base:

FROM paperclipai/paperclip:latest
# Install your tools on top
RUN apt-get update && apt-get install -y ripgrep && rm -rf /var/lib/apt/lists/*

This PR keeps the upstream image minimal and focused. Documenting this pattern properly is a good follow-up.

Verification

Build and run with a non-default UID:

# Build with custom UID/GID baked in
docker build --build-arg USER_UID=1001 --build-arg USER_GID=1001 -t paperclip-test .

# Or override at runtime without rebuild
docker run -e USER_UID=1001 -e USER_GID=1001 \
  -v /host/paperclip/data:/paperclip \
  paperclip-test

# Confirm server runs as expected UID
docker exec <container> id
# uid=1001(node) gid=1001(node)

# Confirm volume ownership is corrected
docker exec <container> ls -la /paperclip
# drwxr-xr-x node node ...

Default build (no args): identical behaviour to the previous image — server runs as node (UID 1000), entrypoint is a no-op remap and proceeds immediately to exec gosu node.

Risks

  • Entrypoint runs briefly as root before the gosu drop — this is intentional and necessary for usermod/chown. The server process itself never runs as root. This is the standard pattern for this use case (Bitnami images, LinuxServer.io images, official Redis/Postgres images all do the same).
  • gosu adds a small dependency (~1 MB). gosu is a well-known, audited privilege-drop tool that avoids the TTY and signal-handling issues of su/sudo in containers.
  • Existing deployments unaffected — the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead.

Checklist

  • Thinking path traces from project context to change
  • Tests run locally and pass
  • Tests added or updated where applicable (no unit tests exist for the Dockerfile; this is a runtime change — verification steps above cover it)
  • Before/after screenshots included (N/A — no UI change)
  • Risks considered and documented
  • Will address all Greptile and reviewer comments before merge

@davison davison changed the title fix: remap container UID/GID at runtime to avoid volume mount permission errors fix(docker): remap container UID/GID at runtime to avoid volume mount permission errors Mar 27, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 27, 2026

Greptile Summary

This PR adds runtime UID/GID remapping to the Docker image so that the node user inside the container can be aligned with the host directory owner at startup, eliminating the common volume-mount permission errors. It installs gosu, adds build-time ARG/usermod/groupmod pre-baking in the base stage, and introduces a minimal docker-entrypoint.sh that conditionally remaps and chowns only when the UID/GID actually differ — immediately exec-ing the server otherwise.

The three issues raised in the previous review rounds have all been addressed cleanly:

  • ARG USER_UID/USER_GID are now re-declared in the production stage and correctly forwarded through ENV USER_UID=${USER_UID} — build-baked values are no longer silently overwritten at runtime.
  • The chown -R node:node /paperclip call is now guarded by a changed flag, so default-UID containers skip it entirely.
  • usermod -g \"$PGID\" node is called after groupmod to update the primary GID field in /etc/passwd, ensuring gosu node starts with the intended GID.

Two minor P2 suggestions remain:

  • ENTRYPOINT [\"docker-entrypoint.sh\"] should use the full absolute path /usr/local/bin/docker-entrypoint.sh to follow Docker conventions and avoid any PATH-dependency.
  • The entrypoint has no guard against PUID=0 or PGID=0, which would silently remap node to root and make the privilege drop a no-op.

Confidence Score: 5/5

Safe to merge — all previously raised blocking issues are resolved; remaining findings are minor style and hardening suggestions.

All three P1 issues from prior rounds (ARG propagation, unconditional chown, broken primary GID) are correctly fixed in the latest push. The two remaining comments are P2 (absolute ENTRYPOINT path convention and a missing zero-UID guard), neither of which causes incorrect behaviour in normal use. The core UID/GID remapping logic is sound and follows the standard gosu pattern.

No files require special attention.

Important Files Changed

Filename Overview
Dockerfile Adds gosu, build-time ARG/usermod/groupmod in the base stage, re-declares ARGs in production stage to correctly propagate USER_UID/USER_GID into ENV, copies and installs the entrypoint script, and replaces the static USER directive with the new ENTRYPOINT. Minor: ENTRYPOINT uses a relative name instead of the full absolute path.
docker-entrypoint.sh New entrypoint that conditionally remaps node UID/GID at runtime, guards the recursive chown behind a changed flag (avoiding startup latency on default-UID containers), fixes primary GID via usermod -g after groupmod, then drops to node via exec gosu. No validation that PUID/PGID are non-zero.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: Dockerfile
Line: 73

Comment:
**ENTRYPOINT should use absolute path**

`ENTRYPOINT ["docker-entrypoint.sh"]` relies on `PATH` lookup at runtime. While `/usr/local/bin` is in `PATH` for this image and it works in practice, the canonical Docker convention (and what linters like Hadolint enforce) is to use the full absolute path in exec-form `ENTRYPOINT`. Using a relative name is fragile if anyone customises the image's `PATH`.

```suggestion
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 5-6

Comment:
**No guard against `PUID=0` / `PGID=0`**

If an operator accidentally (or deliberately) sets `USER_UID=0` or `USER_GID=0`, `usermod` will remap `node` to UID/GID 0. `exec gosu node` then starts the server process as root — the privilege-drop silently becomes a no-op. This defeats the security model of the whole entrypoint.

The standard pattern (used by linuxserver.io and Bitnami images) is to validate and reject 0 early:

```sh
if [ "$PUID" = "0" ] || [ "$PGID" = "0" ]; then
    echo "ERROR: USER_UID and USER_GID must not be 0 (root). Refusing to start." >&2
    exit 1
fi
```

Adding this check after lines 5–6 makes the failure loud and obvious rather than silent.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (4): Last reviewed commit: "fix: support host UID/GID mapping for vo..." | Re-trigger Greptile

Dockerfile Outdated
Comment on lines +61 to +62
USER_UID=1000 \
USER_GID=1000 \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Build-arg USER_UID/USER_GID not propagated to runtime ENV

ARG values are scoped to the stage they are declared in. ARG USER_UID=1000 / ARG USER_GID=1000 are declared in the base stage, but the production stage never re-declares them. As a result, the ENV block hardcodes the literal 1000 rather than forwarding the build-arg value.

Concrete failure scenario: docker build --build-arg USER_UID=1001 -t paperclip-test .

  1. base stage: usermod -u 1001 node runs correctly — node user is UID 1001 in the image layer.
  2. production stage: ENV USER_UID=1000 — runtime default is still 1000.
  3. Container start: entrypoint reads PUID=1000, sees id -u node is 1001, and runs usermod -o -u 1000 nodeundoing the build-time baking entirely.

The fix is to re-declare the ARGs in the production stage and use them in the ENV block:

ARG USER_UID=1000
ARG USER_GID=1000

ENV NODE_ENV=production \
  ...
  USER_UID=${USER_UID} \
  USER_GID=${USER_GID} \
  ...
Prompt To Fix With AI
This is a comment left during a code review.
Path: Dockerfile
Line: 61-62

Comment:
**Build-arg `USER_UID`/`USER_GID` not propagated to runtime `ENV`**

`ARG` values are scoped to the stage they are declared in. `ARG USER_UID=1000` / `ARG USER_GID=1000` are declared in the `base` stage, but the `production` stage never re-declares them. As a result, the `ENV` block hardcodes the literal `1000` rather than forwarding the build-arg value.

**Concrete failure scenario:** `docker build --build-arg USER_UID=1001 -t paperclip-test .`

1. `base` stage: `usermod -u 1001 node` runs correctly — `node` user is UID 1001 in the image layer.
2. `production` stage: `ENV USER_UID=1000` — runtime default is still **1000**.
3. Container start: entrypoint reads `PUID=1000`, sees `id -u node` is 1001, and runs `usermod -o -u 1000 node`**undoing the build-time baking entirely**.

The fix is to re-declare the ARGs in the `production` stage and use them in the `ENV` block:

```
ARG USER_UID=1000
ARG USER_GID=1000

ENV NODE_ENV=production \
  ...
  USER_UID=${USER_UID} \
  USER_GID=${USER_GID} \
  ...
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push (268a75e). Re-declared ARG USER_UID=1000 / ARG USER_GID=1000 at the top of the production stage and changed the ENV block to USER_UID=${USER_UID} / USER_GID=${USER_GID} so the build-arg value is correctly forwarded to the runtime environment.

fi

# Ensure correct ownership of the data volume
chown -R node:node /paperclip
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 chown -R runs unconditionally on every container start

chown -R node:node /paperclip is not guarded behind the UID/GID-change branches — it executes on every startup, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."

For an AI agent orchestration platform, /paperclip accumulates persistent state (logs, databases, agent artefacts). A recursive chown over a large directory tree on every container restart introduces measurable startup latency that will grow over time.

The standard remedy is to move chown inside the branches so it only runs when a remap actually happened:

changed=0

if [ "$(id -u node)" -ne "$PUID" ]; then
    echo "Updating node UID to $PUID"
    usermod -o -u "$PUID" node
    changed=1
fi

if [ "$(id -g node)" -ne "$PGID" ]; then
    echo "Updating node GID to $PGID"
    groupmod -o -g "$PGID" node
    changed=1
fi

if [ "$changed" = "1" ]; then
    chown -R node:node /paperclip
fi
Prompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 20

Comment:
**`chown -R` runs unconditionally on every container start**

`chown -R node:node /paperclip` is not guarded behind the UID/GID-change branches — it executes on **every startup**, including the common default case where no remapping is needed. This contradicts the PR description's claim that "the entrypoint detects no UID/GID change and immediately exec-s the server with zero overhead."

For an AI agent orchestration platform, `/paperclip` accumulates persistent state (logs, databases, agent artefacts). A recursive `chown` over a large directory tree on every container restart introduces measurable startup latency that will grow over time.

The standard remedy is to move `chown` inside the branches so it only runs when a remap actually happened:

```sh
changed=0

if [ "$(id -u node)" -ne "$PUID" ]; then
    echo "Updating node UID to $PUID"
    usermod -o -u "$PUID" node
    changed=1
fi

if [ "$(id -g node)" -ne "$PGID" ]; then
    echo "Updating node GID to $PGID"
    groupmod -o -g "$PGID" node
    changed=1
fi

if [ "$changed" = "1" ]; then
    chown -R node:node /paperclip
fi
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push (268a75e). Added a changed=0 flag; chown -R node:node /paperclip now only runs when changed=1 (i.e. when usermod or groupmod actually executed). Default-UID containers skip the chown entirely and exec immediately.

@radiusred-cody radiusred-cody bot force-pushed the fix/docker-volumes branch from 3b88682 to 268a75e Compare March 27, 2026 16:59
@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

Updated the PR with fixes for both P1 Greptile findings:

  1. ARG scope — Re-declared ARG USER_UID/USER_GID in the production stage; ENV now uses ${USER_UID}/${USER_GID} so build-args correctly propagate to runtime.
  2. Conditional chown — Added changed=0 flag; chown -R node:node /paperclip only runs when a UID/GID remap actually occurred. Default deployments (UID 1000) skip the chown entirely.

Both fixes are in commit 268a75e.

1 similar comment
@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

Updated the PR with fixes for both P1 Greptile findings:

  1. ARG scope — Re-declared ARG USER_UID/USER_GID in the production stage; ENV now uses ${USER_UID}/${USER_GID} so build-args correctly propagate to runtime.
  2. Conditional chown — Added changed=0 flag; chown -R node:node /paperclip only runs when a UID/GID remap actually occurred. Default deployments (UID 1000) skip the chown entirely.

Both fixes are in commit 268a75e.

@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

@greptileai

Comment on lines +17 to +21

if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 GID remapping is silently broken

groupmod -o -g "$PGID" node updates the GID record in /etc/group but does not update the user's primary GID field stored in /etc/passwd. When exec gosu node "$@" runs, gosu calls getpwnam("node") which reads the primary GID from /etc/passwd — still the original value. The process therefore always starts with the old GID, regardless of what USER_GID is set to.

Concrete consequence: with USER_GID=1001:

  1. groupmod -o -g 1001 node/etc/group now says node:x:1001:
  2. chown -R node:node /paperclip → sets /paperclip gid to 1001 (resolving via /etc/group)
  3. exec gosu node → reads /etc/passwd, primary gid = 1000 (unchanged) → process runs as gid=1000
  4. New files created by the server inside /paperclip are owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them

The same issue affects the build-time groupmod step in the Dockerfile (line 11).

The fix is to call usermod -g "$PGID" node after groupmod, which updates the primary GID entry in /etc/passwd:

Suggested change
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
changed=1
if [ "$(id -g node)" -ne "$PGID" ]; then
echo "Updating node GID to $PGID"
groupmod -o -g "$PGID" node
usermod -g "$PGID" node
changed=1
fi

And in the Dockerfile build-time block:

RUN usermod -u $USER_UID --non-unique node \
  && groupmod -g $USER_GID --non-unique node \
  && usermod -g $USER_GID node
Prompt To Fix With AI
This is a comment left during a code review.
Path: docker-entrypoint.sh
Line: 17-21

Comment:
**GID remapping is silently broken**

`groupmod -o -g "$PGID" node` updates the GID record in `/etc/group` but does **not** update the user's primary GID field stored in `/etc/passwd`. When `exec gosu node "$@"` runs, `gosu` calls `getpwnam("node")` which reads the primary GID from `/etc/passwd` — still the original value. The process therefore always starts with the old GID, regardless of what `USER_GID` is set to.

Concrete consequence: with `USER_GID=1001`:
1. `groupmod -o -g 1001 node``/etc/group` now says `node:x:1001:`
2. `chown -R node:node /paperclip` → sets `/paperclip` gid to 1001 (resolving via `/etc/group`)
3. `exec gosu node` → reads `/etc/passwd`, primary gid = 1000 (unchanged) → process runs as gid=1000
4. New files created by the server inside `/paperclip` are owned by gid=1000, not the intended 1001 — the host user with gid=1001 has no group write access to them

The same issue affects the build-time `groupmod` step in the `Dockerfile` (line 11).

The fix is to call `usermod -g "$PGID" node` after `groupmod`, which updates the primary GID entry in `/etc/passwd`:

```suggestion
if [ "$(id -g node)" -ne "$PGID" ]; then
    echo "Updating node GID to $PGID"
    groupmod -o -g "$PGID" node
    usermod -g "$PGID" node
    changed=1
fi
```

And in the `Dockerfile` build-time block:
```dockerfile
RUN usermod -u $USER_UID --non-unique node \
  && groupmod -g $USER_GID --non-unique node \
  && usermod -g $USER_GID node
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in the latest push (a0e94b4). Added usermod -g "$PGID" node after groupmod in docker-entrypoint.sh, and && usermod -g $USER_GID node after groupmod in the base stage Dockerfile. This updates the primary GID field in /etc/passwd so gosu node resolves the correct GID from the start.

@radiusred-cody radiusred-cody bot force-pushed the fix/docker-volumes branch from 268a75e to a0e94b4 Compare March 27, 2026 17:11
@davison
Copy link
Copy Markdown
Author

davison commented Mar 27, 2026

@greptileai

@davison
Copy link
Copy Markdown
Author

davison commented Mar 29, 2026

This (and the 2 queued up behind it awaiting this to be merged) will eventually address a number of open issues too:

The first one here should be a simple quality of life improvement for docker/podman users and has no effect on any functional part of the codebase. Greptile score is 5/5, could it be merged @cryppadotta 🙏🏻 ?

@davison davison marked this pull request as draft March 29, 2026 19:01
@davison davison marked this pull request as ready for review March 29, 2026 19:12
Add USER_UID/USER_GID build args and gosu-based entrypoint to remap
the node user's UID/GID at container startup, avoiding permission
errors when mounting a host directory as /paperclip.

- Build-time: usermod/groupmod adjust the node user to match build args
- Runtime: docker-entrypoint.sh re-adjusts UID/GID from env vars and
  drops privileges via gosu before starting the server

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@davison davison force-pushed the fix/docker-volumes branch from a0e94b4 to 3349f94 Compare March 29, 2026 23:54
@davison
Copy link
Copy Markdown
Author

davison commented Mar 30, 2026

@devinfoley @cryppadotta - any chance this can either be merged or could someone possibly let me know why it can't be? Getting no feedback here or discord. Just trying to help out a bit. I can obviously publish from my fork, but I'm not sure that really helps the wider community much 😞

There are others queued up behind this that might add material benefit, but trying to keep the PRs small and focused for reviews

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant